
Computer Programming: Concepts and Evolution for the 'From Scratch' Builder
This resource explores the fundamental concepts of computer programming, tracing its history and introducing key ideas essential for understanding how software interacts with hardware, particularly relevant when delving into the foundational aspects of computing.
What is Computer Programming?
At its core, computer programming is the art and science of telling a computer what to do. It involves creating a sequence of instructions that a computer can follow to perform a specific task. This sequence of instructions is called a program.
Program: A sequence of instructions written in a programming language that a computer can execute to perform a specific task.
The process of programming involves designing and implementing algorithms.
Algorithm: A step-by-step specification of a procedure or formula for solving a problem or accomplishing a task. Algorithms must be unambiguous, have clear inputs and outputs, and terminate after a finite number of steps.
Algorithms are translated into a form the computer can understand using programming languages.
Programming Language: A formal language comprising a set of instructions used to produce output. Programming languages are used to write programs. They provide a structured way for humans to write instructions that can be converted into instructions executable by a computer.
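To make these definitions concrete, here is a minimal sketch in Python (one language among many): Euclid's algorithm for the greatest common divisor, written as a short program with clear inputs, a clear output, and guaranteed termination.

```python
# Euclid's algorithm: an unambiguous, terminating, step-by-step procedure,
# expressed as a program in a programming language (Python).

def gcd(a: int, b: int) -> int:
    """Return the greatest common divisor of two positive integers."""
    while b != 0:            # repeat until the remainder is zero
        a, b = b, a % b      # each step strictly shrinks b, so the loop terminates
    return a

print(gcd(48, 18))  # inputs 48 and 18; output: 6
```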
Programmers typically work with high-level programming languages, which are designed to be easier for humans to read and write. These languages use keywords and structures closer to natural language and mathematical notation. However, the computer's Central Processing Unit (CPU) doesn't directly understand high-level languages. It understands machine code.
Machine Code: The lowest-level programming language, consisting of binary or hexadecimal instructions that a computer's CPU can directly execute. Each instruction corresponds to a basic operation performed by the CPU.
The process of converting high-level code into machine code is usually done by a compiler or an interpreter. Proficient programming often requires knowledge across several areas:
- Understanding the specific problem domain (what you're trying to achieve).
- Details of the programming languages being used.
- Utilizing existing code libraries (collections of pre-written code).
- Applying specialized algorithms.
- Using formal logic to structure solutions.
Related Activities: Beyond Just Writing Code
While writing the code itself is central, computer programming is often part of a larger process called software development. Auxiliary tasks include:
- Analyzing Requirements: Understanding what the program needs to do from a user or system perspective.
- Testing: Verifying that the program works correctly under various conditions.
- Debugging: Investigating and fixing problems (bugs) in the code.
- Implementing Build Systems: Automating the process of converting source code into an executable program.
- Managing Derived Artifacts: Handling the outputs of the development process, such as the final machine code.
While sometimes "programming" is used broadly to include these, the term software development often refers to this entire lifecycle. Software engineering implies a more formal, structured approach, often using established methodologies and engineering design principles.
A Journey Through Time: History of Programming
The concept of creating programmable sequences isn't solely tied to electronic computers. Programmable devices have a rich history:
- 9th Century: The Banu Musa brothers in Baghdad described a programmable mechanical flute player.
- 9th Century: Algorithmic thinking also appeared in code-breaking. Al-Kindi described cryptanalysis by frequency analysis, an early example of an algorithm for deciphering coded messages.
- 1206: Al-Jazari, an Arab engineer, invented a programmable drum machine that used pegs and cams to control rhythms.
- 1801: The Jacquard loom used punched pasteboard cards to program complex weaving patterns, demonstrating that physical media could store instructions.
These early examples show that the idea of a machine following a stored sequence of steps predates electronic computers by centuries. The "program" was physically represented (cams, pegs, cards).
The First Computer Program
The generally accepted "first computer program" was an algorithm published in 1843 by mathematician Ada Lovelace. This algorithm, intended for Charles Babbage's proposed Analytical Engine, was designed to calculate Bernoulli numbers. Lovelace's work was significant not only for the algorithm itself but for her insight into the Analytical Engine's potential beyond simple calculation, envisioning its use for manipulating symbols such as musical notes. Babbage himself had written earlier programs for the Analytical Engine in 1837.
Analytical Engine: A proposed mechanical general-purpose computer designed by Charles Babbage in the 1830s. Although never fully built in his lifetime, its design incorporated many features found in modern computers, including a "mill" (CPU), a "store" (memory), input via punched cards, and output.
From Punched Cards to Stored Programs
- 1880s: Herman Hollerith developed the concept of storing data in machine-readable form using punched cards for the US census. Later, his tabulating machines could be "programmed" for different jobs using plug boards (control panels), similar to telephone switchboards, setting connections for specific calculations.
- Late 1940s: Early electronic computers also used control panels for programming.
- Late 1940s: A pivotal moment arrived with the concept of the stored-program computer. This architecture, notably described by John von Neumann and first realized in early machines such as the Manchester Baby and EDSAC, proposed that both programs and data should reside together in the computer's memory, accessible and modifiable by the CPU.
Stored-Program Concept: The fundamental principle of modern computer architecture where instructions (the program) and data are stored together in the same memory space and can be accessed and manipulated by the central processing unit (CPU). This allows computers to be easily reprogrammed without physical rewiring.
This shift from physically reconfiguring the machine or feeding external programs one instruction at a time (like with early punched cards) to storing the program internally was revolutionary, paving the way for flexible, general-purpose computing.
The Evolution of Programming Languages
Understanding the layers of programming languages is crucial for appreciating the journey from raw hardware instructions to human-readable code.
Machine Language
In the earliest electronic computers based on the stored-program concept, programs were written directly in machine code.
- Characteristics: Consists of instructions from the instruction set of a particular CPU, usually represented in binary (sequences of 0s and 1s) or sometimes in octal or hexadecimal for slightly better human readability.
- Direct Execution: Machine code is the only language the CPU understands and executes directly.
- Difficulty: Extremely difficult for humans to write and read due to its low-level nature and binary representation. Each operation (like adding two numbers) and memory address had to be specified numerically.
- Machine Dependence: Machine code is tied directly to the architecture of a specific CPU. Code written for one type of CPU will not run on another without complete translation.
Assembly Language
Assembly languages were the first significant step away from raw machine code while still remaining very close to the hardware.
Assembly Language: A low-level programming language that uses mnemonic codes (short abbreviations) for machine instructions and symbolic names for memory addresses, making programs slightly easier for humans to read and write compared to raw machine code.
- Mnemonics: Instead of writing the binary code for an addition instruction, an assembly language programmer might write ADD.
- Labels/Symbols: Instead of referring to a memory address by its numerical value (e.g., 10110100), a programmer could use a meaningful name like TOTAL or X.
- Assembler: A utility program called an assembler translates the assembly language code into machine code.
- Machine Dependence: Like machine code, assembly language is specific to a particular CPU's instruction set. An assembly program written for an Intel x86 processor will not work on an ARM processor.
Assembly language is still used today when direct hardware control, maximum performance, or minimal code size is critical (e.g., operating system kernels, embedded systems, device drivers).
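To illustrate what an assembler does, here is a toy sketch in Python. The instruction set and opcode numbers are invented for illustration (no real CPU uses them); the point is the two translations described above: mnemonics become numeric opcodes, and symbolic labels become numeric addresses.

```python
# A toy assembler for an imaginary CPU. All mnemonics and opcode numbers
# below are hypothetical, chosen only to illustrate the translation steps.

OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "JMP": 0x04}

def assemble(lines):
    labels, program, address = {}, [], 0
    # First pass: record the numeric address of each symbolic label.
    for line in lines:
        if line.endswith(":"):
            labels[line[:-1]] = address
        else:
            address += 1
    # Second pass: replace mnemonics with opcodes and labels with addresses.
    for line in lines:
        if line.endswith(":"):
            continue
        mnemonic, operand = line.split()
        value = labels[operand] if operand in labels else int(operand)
        program.append((OPCODES[mnemonic], value))
    return program

source = ["start:", "LOAD 10", "ADD 32", "STORE 99", "JMP start"]
print(assemble(source))  # [(1, 10), (2, 32), (3, 99), (4, 0)]
```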
High-Level Languages and Compilers
The development of high-level languages marked a major leap in programmer productivity and program portability.
- Purpose: To make programming simpler, more understandable, and less tied to the specific details of the underlying hardware.
- Abstraction: High-level languages provide abstraction layers. They allow programmers to think about operations and data structures in more complex and conceptual terms, rather than just raw memory locations and CPU registers. For example, instead of a sequence of low-level instructions to perform multiplication, you might simply write result = a * b.
- Early Development: The first compiler-related tool was the A-0 System, developed by Grace Hopper in 1952. Hopper also coined the term "compiler".
- First Widely Used: FORTRAN (FORmula TRANslation), released in 1957, was the first widely used high-level language with a functional implementation. It was designed for scientific and engineering calculations.
- Other Early Examples: COBOL (Common Business-Oriented Language) was developed for business data processing, and Lisp (LISt Processor) for computer research and artificial intelligence.
- Compilation: High-level languages require a compiler to translate the source code into machine code before the program can be executed.
Compiler: A program that translates source code written in a high-level programming language into machine code or another lower-level language that can be executed by a computer. The entire program is typically translated before execution begins.
Interpreter: (Not explicitly in the source text, but important context for high-level languages) An alternative to a compiler, an interpreter executes high-level code instruction by instruction, without a separate compilation step beforehand. This can make development faster but often results in slower execution.
Compiled languages allowed programmers to write code using richer syntax and structures. Compilers harness the power of the computer itself to automate the tedious translation process, allowing programmers to focus on the logic of the program rather than the specifics of the machine instructions. While high-level languages aim for portability, achieving it perfectly can still be challenging due to differences in operating systems, hardware resources, and available compilers.
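One way to see this layering from inside a high-level language: Python's built-in compile() function and dis module expose the lower-level instructions generated for a single high-level statement. (A sketch: Python compiles to bytecode for a virtual machine rather than to CPU machine code, but the principle of translating one readable statement into several primitive instructions is the same.)

```python
import dis

# Compile the single high-level statement and disassemble the result.
code = compile("result = a * b", "<example>", "exec")
dis.dis(code)
# Typical output (exact opcode names vary by Python version):
#   LOAD_NAME   a
#   LOAD_NAME   b
#   BINARY_OP   *        (BINARY_MULTIPLY on older versions)
#   STORE_NAME  result
```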
How Code Was Entered
The physical act of creating and inputting programs also evolved significantly.
- Early Methods: Programs were often written out on paper (coding sheets) and then transferred to physical media like punched cards or paper tape. Each instruction or piece of data was represented by patterns of holes. These cards or tapes were then fed into card readers or tape readers connected to the computer. Editing involved physically replacing cards or re-punching sections of tape.
- Late 1960s: As data storage (like magnetic disks) and computer terminals became more affordable, programmers could type code directly into the computer.
- Text Editors: The development of text editors revolutionized coding by allowing programmers to easily enter, modify, and save source code electronically, making the process much faster and more efficient than using punched cards.
This evolution reflects the increasing ease of interaction between the programmer and the machine, moving from laborious physical methods to direct, interactive text manipulation.
Essential Qualities of Good Programs
Regardless of the language or approach, certain qualities are universally desired in computer programs. Understanding these is key to writing effective and reliable code, whether for a simple script or a complex system.
- Reliability: How consistently the program produces correct results.
- Context: This depends on designing correct algorithms and minimizing errors in implementation. Common programming mistakes include:
- Resource Management Issues: Like buffer overflows (writing data beyond the allocated memory buffer, potentially overwriting other data or instructions) or race conditions (when the outcome of operations depends on the unpredictable timing of multiple independent processes or threads accessing shared resources).
- Logic Errors: Mistakes in the program's logic, such as attempting to divide by zero or off-by-one errors (e.g., a loop iterating one time too many or too few).
- Robustness: How well the program handles unexpected situations or errors that aren't necessarily bugs in the code itself.
- Context: This includes dealing with invalid or corrupted input data, unavailability of resources (like memory or network connections), user mistakes, and hardware failures. A robust program might, for example, display a clear error message instead of crashing when given incorrect input.
- Portability: The ease with which the source code can be compiled/interpreted and run on different computer hardware and operating system platforms.
- Context: This is significantly influenced by the chosen programming language and whether it provides platform-independent features or requires platform-specific code. High-level languages generally aim for higher portability than assembly or machine code.
- Maintainability: How easy it is for current or future developers to modify the program (fix bugs, add features, adapt to new environments).
- Context: Good design, clear structure, and readable code (discussed below) are crucial for maintainability. Programs that are hard to understand are difficult and costly to change.
- Efficiency/Performance: How effectively the program uses system resources (CPU time, memory, disk I/O, network bandwidth).
- Context: Less resource usage is generally better. This involves choosing efficient algorithms and implementing them carefully. While the programming language can impact performance (lower-level languages often allow for more direct optimization), the algorithm and the implementation often have a larger effect. Sometimes, spending more programmer time to write efficient code is necessary; other times, using more hardware might be a cheaper solution than optimizing code written in a less performant language.
Using automated tests can help ensure reliability and catch regressions when changes are made, contributing to maintainability.
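As a minimal sketch of robustness and automated testing (the function and test below are illustrative, not from the original article): the function validates its input and reports a clear error instead of crashing, and the test catches regressions if the code is later changed.

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        # Robustness: report a clear error instead of dividing by zero.
        raise ValueError("average() requires at least one value")
    return sum(values) / len(values)

def test_average():
    assert average([2, 4, 6]) == 4
    try:
        average([])
    except ValueError:
        pass  # the expected, clearly reported failure
    else:
        raise AssertionError("empty input should raise ValueError")

test_average()
print("all tests passed")
```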
Readability of Source Code
Readability is a critical aspect of program quality, particularly impacting maintainability.
Readability: The ease with which a human reader can understand the purpose, logic flow, and operation of source code.
- Why it Matters: Programmers spend a significant amount of their time reading and trying to understand existing code (to fix, modify, or reuse it) rather than writing entirely new code. Unreadable code leads to misunderstandings, makes debugging harder, increases the likelihood of introducing new bugs, and can result in duplicated efforts.
- Factors Influencing Readability:
- Consistent Style: Following standard conventions for indentation, spacing, and formatting.
- Comments: Explanations embedded in the code to clarify complex parts, non-obvious logic, or the purpose of sections.
- Decomposition: Breaking down large tasks into smaller, manageable functions or modules.
- Naming Conventions: Using clear, descriptive names for variables, functions, classes, etc., that indicate their purpose.
While editors can help with formatting (indentation, syntax highlighting), the content aspects (comments, naming, structure) reflect the programmer's skill in communicating with other humans (including their future self).
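As an illustrative before-and-after sketch of these factors (all names and numbers here are hypothetical), the two functions below compute the same result; only the readability differs.

```python
# Hard to read: cryptic names, "magic numbers", no decomposition.
def p(a, b, c):
    return a * b * (1 + c) ** 2

# Easier to read: descriptive names, intermediate steps, a clarifying comment.
def projected_cost(unit_price, quantity, annual_inflation_rate):
    """Cost of an order two years from now, assuming constant inflation."""
    base_cost = unit_price * quantity
    inflation_factor = (1 + annual_inflation_rate) ** 2  # compounded over two years
    return base_cost * inflation_factor
```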
Algorithmic Complexity
A foundational concept in programming, especially when performance matters, is understanding the efficiency of algorithms.
Algorithmic Complexity: A measure of the resources (such as execution time or memory usage) that an algorithm requires, typically expressed as a function of the size of its input.
- Big O Notation: A standard notation used to describe the upper bound of an algorithm's complexity in the worst-case scenario. It expresses how the resource requirement scales as the input size grows (e.g., O(n) for linear time, O(n^2) for quadratic time, O(log n) for logarithmic time).
- Importance: Expert programmers are familiar with common algorithms and their complexities (e.g., different sorting algorithms have drastically different performance characteristics on large datasets). This knowledge allows them to choose the most appropriate algorithm for a given problem and data size, directly impacting the program's efficiency.
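A small sketch of why this matters: searching a sorted list of a million items takes on the order of a million comparisons with an O(n) linear search, but only about twenty with an O(log n) binary search (here using Python's standard bisect module).

```python
from bisect import bisect_left

def linear_search(items, target):          # O(n): may inspect every item
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1

def binary_search(sorted_items, target):   # O(log n): halves the range each step
    index = bisect_left(sorted_items, target)
    if index < len(sorted_items) and sorted_items[index] == target:
        return index
    return -1

data = list(range(1_000_000))
print(linear_search(data, 999_999))  # 999999, after ~1,000,000 comparisons
print(binary_search(data, 999_999))  # 999999, after ~20 comparisons
```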
Programming Languages in Detail
Programming languages are the tools programmers use to write instructions. They vary widely in their design, philosophy, and intended use.
- Programming Paradigms: Different languages support different styles of programming, known as paradigms (e.g., imperative, object-oriented, functional, logic).
- Choosing a Language: The choice depends on factors like the type of task, company standards, available libraries, performance needs, and programmer familiarity.
- Spectrum of Levels: Languages exist on a spectrum:
- Low-level: Machine-oriented (Machine Code, Assembly Language). Closer to hardware, faster execution (potentially), but harder to write, read, and less portable.
- High-level: More abstract (Python, Java, C++, etc.). Easier for humans to use, more portable (generally), but require compilation or interpretation and may be less efficient for specific low-level tasks.
Programming Paradigm: A fundamental style or approach to building computer programs. Examples include imperative programming (focus on sequence of statements), object-oriented programming (focus on objects and their interactions), functional programming (focus on functions and avoiding changing state), and logic programming (focus on defining rules and facts).
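As a brief sketch of two of these paradigms in Python, here is the same task (summing the squares of the even numbers in a list) written imperatively and functionally.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Imperative style: an explicit sequence of statements mutating a variable.
total = 0
for n in numbers:
    if n % 2 == 0:
        total += n * n

# Functional style: composing functions, with no mutable state.
total_functional = sum(map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers)))

assert total == total_functional == 56  # 4 + 16 + 36
```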
Despite their differences, most programming languages provide basic operations necessary for computation, as summarized by Allen Downey:
- Input: Getting data from a source (keyboard, file, network, sensor).
- Output: Sending data to a destination (screen, file, network, actuator).
- Arithmetic: Performing mathematical calculations (addition, subtraction, multiplication, division).
- Conditional Execution: Making decisions based on conditions (e.g., "IF this is true, THEN do that; ELSE do something else").
- Repetition: Performing actions multiple times (loops, iterations).
These fundamental operations are the building blocks of almost any program, regardless of how complex the task or how high-level the language.
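A tiny Python sketch touching all five of these basic operations:

```python
values = input("Enter numbers separated by spaces: ")   # Input
total = 0
for token in values.split():                            # Repetition
    number = float(token)
    if number >= 0:                                     # Conditional execution
        total = total + number                          # Arithmetic
print("Sum of the non-negative numbers:", total)        # Output
```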
Many languages also allow the use of libraries, which are collections of pre-written code modules that provide common functionality (like mathematical functions, input/output operations, or network communication). This saves programmers from having to write every piece of code from scratch. As long as libraries follow compatible calling conventions, functions written in different languages can sometimes be linked together.
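For example, a one-line sketch of library reuse: Python's standard math module already provides a square-root routine, so there is no need to implement one from scratch.

```python
import math

print(math.sqrt(2))  # 1.4142135623730951
```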
Learning to Program
The process of learning to program has evolved from a niche skill for scientists and engineers to a more widespread pursuit.
- Historical Context: Early programmers often learned through informal methods, user groups, books, and magazines. Academic courses and corporate training were crucial for professional roles. Pioneers like Maurice Wilkes, David Wheeler, and Stanley Gill wrote early texts with subroutines for machines like the EDSAC.
- Influence of High-Level Languages: The advent of languages like FORTRAN, COBOL, BASIC, Pascal, and C led to numerous introductory books explaining language features and programming concepts. Early primers often started with a simple "Hello, World" program.
- Key Texts: Influential books included Kemeny & Kurtz's BASIC Programming, Kernighan & Ritchie's The C Programming Language, Donald Knuth's The Art of Computer Programming (focused on algorithms), and Kernighan & Plauger's The Elements of Programming Style (emphasizing code readability).
- Mass Market Era: The rise of personal computers spurred a wave of trade books and magazines aimed at hobbyists and professionals, covering specific hardware platforms and languages (e.g., programming the Z80, IBM PC).
- Digital Age: Since the 2000s, learning resources have shifted significantly online. Blogs, wikis, videos (YouTube), online courses (Khan Academy, Codecademy), code repositories (GitHub), documentation sites (W3Schools), and coding bootcamps have become primary sources of instruction. Integrated Development Environments (IDEs) and Software Development Kits (SDKs) often include extensive digital documentation and help resources.
This historical journey shows that learning programming has always involved accessing resources, whether physical or digital, to understand the language, the machine, and the best practices for crafting instructions.
Programmers
Computer programmers are the individuals who write computer software. The perception of programming has sometimes portrayed it as purely mathematical, but research indicates that strong skills in natural language comprehension and communication are also highly valuable. Learning to code shares similarities with learning a foreign language, involving syntax, semantics, and logic.